
ecmult_multi: Replace scratch space with malloc, use abcd cost model#1789

Open
fjahr wants to merge 2 commits into bitcoin-core:master from fjahr:2025-11-mem-multi-var

Conversation

Contributor

@fjahr fjahr commented Dec 15, 2025

This is an implementation of the discussed changes from an in-person meeting in October. It removes usage of scratch space in batch validation and replaces it with internal malloc usage. It also adds an ABCD cost model for algorithm selection.
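To make the selection idea concrete, here is a minimal sketch of how an ABCD-style cost model can drive algorithm choice. The names and table values are illustrative, not the PR's actual identifiers: each algorithm is described by a memory footprint `A*n + B` bytes and a relative running-time estimate `C*n + D`, and selection picks the fastest algorithm whose memory footprint fits the limit.

```c
#include <stddef.h>

/* Hypothetical ABCD entry: memory = a*n + b, estimated time = c*n + d.
 * Entry 0 is assumed to be a no-memory fallback that always fits. */
typedef struct {
    size_t a;  /* per-point memory (bytes) */
    size_t b;  /* fixed memory (bytes) */
    size_t c;  /* per-point time coefficient */
    size_t d;  /* fixed time cost */
} abcd_t;

/* Returns the index of the fastest algorithm whose memory footprint
 * fits mem_limit, falling back to index 0 if nothing else fits. */
static int select_algo(const abcd_t *tab, int num, size_t n, size_t mem_limit) {
    int best = 0;
    size_t best_cost = tab[0].c * n + tab[0].d;
    int i;
    for (i = 1; i < num; i++) {
        size_t mem = tab[i].a * n + tab[i].b;
        size_t cost = tab[i].c * n + tab[i].d;
        if (mem <= mem_limit && cost < best_cost) {
            best = i;
            best_cost = cost;
        }
    }
    return best;
}
```

With a table like this, a small batch under a generous memory limit selects a Strauss-like entry, a large batch selects a Pippenger-like entry with better per-point cost, and a tiny memory limit falls back to entry 0.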

The API and internals follow the spec drafted at the meeting very closely: https://gist.github.com/fjahr/c2a009487dffe7a1fbf17ca1821976ca There are a few minor changes that should not affect the intended behavior. Test coverage is currently a bit lower than it was previously; I expect an adapted form of the test_ecmult_multi test should be added back, and there are some TODOs left in the test code that I plan to address after a first round of conceptual feedback.

The second commit demonstrates the calibration tooling I have been using, though it is not exactly what produced the results in the PR. I have been struggling with the calibration code and never quite get a result that works without manual tweaking, so both the calibration itself and the code added there are a work in progress. I assume some version of the calibration code should be added to the repo, but I haven't thought much about the best place for it; putting it into the benchmark and combining it with the Python script was simply a convenient way to experiment. I am very open to suggestions on how to change this.

@fjahr fjahr force-pushed the 2025-11-mem-multi-var branch from 68dd612 to 8b62524 Compare December 16, 2025 21:21
return 1;
}

int secp256k1_musig_pubkey_agg(const secp256k1_context* ctx, secp256k1_xonly_pubkey *agg_pk, secp256k1_musig_keyagg_cache *keyagg_cache, const secp256k1_pubkey * const* pubkeys, size_t n_pubkeys) {
Contributor

It might be better to pass mem_limit as an input argument rather than defining it internally

Contributor Author

Hm, I see the point that this would align with the batch validation API that has been discussed, but I am not sure that justifies changing the existing musig API. It would probably be better to do this in a separate/follow-up PR, but I will wait a bit to see what other reviewers think about this.

* values from the output.
*/
static void run_ecmult_multi_calib(bench_data* data) {
static const size_t batch_sizes[] = {10, 20, 30, 50, 75, 100, 150, 200, 300, 500, 750, 1000, 1500, 2000, 3000, 5000, 7500, 10000, 15000, 20000, 30000};
Contributor

I think it would be nice to have more batch sizes in the crossover region of Strauss and Pippenger (~80-150 points). So, something like this:

    static const size_t batch_sizes[] = {
        /* Small (Strauss region) */
        5, 10, 15, 20, 30, 50, 70,
        /* Crossover region */
        85, 88, 90, 100, 120, 150, 175,
        /* Medium (Pippenger small windows, w=6..8) */
        200, 300, 500, 750, 1000, 1200,
        /* Large (Pippenger large windows, w=9..12) */
        1500, 2000, 3000, 5000, 7500,
        10000, 15000, 20000, 30000
    };

Contributor

@siv2r siv2r Dec 27, 2025

We should probably cap the batch size for Strauss calibration (7,500 or less?). At higher sizes, the benchmark results curve and become outliers, which will skew the linear regression model. Since ecmult_multi_select won't choose Strauss for large batches anyway, we can avoid these sizes. I've attached a visualization below.

[Attached: Strauss benchmark visualization]
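The capping suggested above can be sketched as a small filtering step before fitting: restrict the calibration samples for each algorithm to the batch-size window where it is actually competitive (e.g. keeping Strauss well below 7500 points). This helper is hypothetical, not part of the PR's calibration code.

```c
#include <stddef.h>

/* Copy into out[] only the batch sizes in [lo, hi], returning the
 * number of sizes kept. out[] must have room for n_in entries. */
static size_t filter_sizes(const size_t *in, size_t n_in,
                           size_t lo, size_t hi, size_t *out) {
    size_t i, n_out = 0;
    for (i = 0; i < n_in; i++) {
        if (in[i] >= lo && in[i] <= hi) {
            out[n_out++] = in[i];
        }
    }
    return n_out;
}
```

Running the regression only on the filtered window keeps the curved high-end measurements from skewing the linear fit for algorithms that would never be selected at those sizes anyway.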

Contributor

siv2r commented Dec 27, 2025

I ran the calibration tool, and I'm getting the following results. They are very close to the existing results, up until PIPPENGER_8, after which the C value increases for me (this shouldn't happen, right?). I'll run this a few more times.

My ABCD values
static const struct secp256k1_ecmult_multi_abcd secp256k1_ecmult_multi_abcds[SECP256K1_ECMULT_MULTI_NUM_ALGOS] = {
    {0,                                     0,                                     1000,  0     },
    {SECP256K1_STRAUSS_POINT_SIZE,          0,                                     113,   143  },
    {SECP256K1_PIPPENGER_POINT_SIZE(1),  SECP256K1_PIPPENGER_FIXED_SIZE(1),  199,   303  },
    {SECP256K1_PIPPENGER_POINT_SIZE(2),  SECP256K1_PIPPENGER_FIXED_SIZE(2),  152,   460  },
    {SECP256K1_PIPPENGER_POINT_SIZE(3),  SECP256K1_PIPPENGER_FIXED_SIZE(3),  117,   782  },
    {SECP256K1_PIPPENGER_POINT_SIZE(4),  SECP256K1_PIPPENGER_FIXED_SIZE(4),  100,   1158 },
    {SECP256K1_PIPPENGER_POINT_SIZE(5),  SECP256K1_PIPPENGER_FIXED_SIZE(5),  86,    1837 },
    {SECP256K1_PIPPENGER_POINT_SIZE(6),  SECP256K1_PIPPENGER_FIXED_SIZE(6),  77,    3013 },
    {SECP256K1_PIPPENGER_POINT_SIZE(7),  SECP256K1_PIPPENGER_FIXED_SIZE(7),  72,    4845 },
    {SECP256K1_PIPPENGER_POINT_SIZE(8),  SECP256K1_PIPPENGER_FIXED_SIZE(8),  69,    8775 },
    {SECP256K1_PIPPENGER_POINT_SIZE(9),  SECP256K1_PIPPENGER_FIXED_SIZE(9),  73,    14373},
    {SECP256K1_PIPPENGER_POINT_SIZE(10), SECP256K1_PIPPENGER_FIXED_SIZE(10), 78,    26442},
    {SECP256K1_PIPPENGER_POINT_SIZE(11), SECP256K1_PIPPENGER_FIXED_SIZE(11), 80,    48783},
    {SECP256K1_PIPPENGER_POINT_SIZE(12), SECP256K1_PIPPENGER_FIXED_SIZE(12), 106,   88289},
};

I’ve visualized the benchmarks for all algorithms (see siv2r@3119386 for the diagrams). I’m planning to do a deeper dive into the regression model fit to ensure they align with these results. Will share any useful findings here

Contributor

siv2r commented Dec 27, 2025

I did some analysis of the linear regression model for the CD values (see siv2r@b5985a6). I basically ran ./bench_ecmult calib twice: I used the first run to compute the linear regression model, and the second run to test it against new benchmark values.

ABCD Values
static const struct secp256k1_ecmult_multi_abcd secp256k1_ecmult_multi_abcds[SECP256K1_ECMULT_MULTI_NUM_ALGOS] = {
    {0,                                     0,                                     1000,  0     },
    {SECP256K1_STRAUSS_POINT_SIZE,          0,                                     112,   173  },
    {SECP256K1_PIPPENGER_POINT_SIZE(1),  SECP256K1_PIPPENGER_FIXED_SIZE(1),  197,   285  },
    {SECP256K1_PIPPENGER_POINT_SIZE(2),  SECP256K1_PIPPENGER_FIXED_SIZE(2),  152,   480  },
    {SECP256K1_PIPPENGER_POINT_SIZE(3),  SECP256K1_PIPPENGER_FIXED_SIZE(3),  117,   767  },
    {SECP256K1_PIPPENGER_POINT_SIZE(4),  SECP256K1_PIPPENGER_FIXED_SIZE(4),  100,   1167 },
    {SECP256K1_PIPPENGER_POINT_SIZE(5),  SECP256K1_PIPPENGER_FIXED_SIZE(5),  86,    1887 },
    {SECP256K1_PIPPENGER_POINT_SIZE(6),  SECP256K1_PIPPENGER_FIXED_SIZE(6),  78,    3023 },
    {SECP256K1_PIPPENGER_POINT_SIZE(7),  SECP256K1_PIPPENGER_FIXED_SIZE(7),  73,    4906 },
    {SECP256K1_PIPPENGER_POINT_SIZE(8),  SECP256K1_PIPPENGER_FIXED_SIZE(8),  70,    8889 },
    {SECP256K1_PIPPENGER_POINT_SIZE(9),  SECP256K1_PIPPENGER_FIXED_SIZE(9),  74,    14544},
    {SECP256K1_PIPPENGER_POINT_SIZE(10), SECP256K1_PIPPENGER_FIXED_SIZE(10), 79,    26764},
    {SECP256K1_PIPPENGER_POINT_SIZE(11), SECP256K1_PIPPENGER_FIXED_SIZE(11), 84,    49179},
    {SECP256K1_PIPPENGER_POINT_SIZE(12), SECP256K1_PIPPENGER_FIXED_SIZE(12), 106,   89518},
};
Statistical analysis

Metric Definitions

| Metric | Meaning | Good value for benchmarking |
|---|---|---|
| $R^2$ | How well the linear model fits (0-1) | > 0.95 (strong linear relationship) |
| Std error | Uncertainty in the slope estimate | Low relative to the slope |
| p-value | Probability the slope is 0 by chance | < 0.05 (statistically significant) |
| Slope | Time increase per additional point (μs) | Positive, algorithm-dependent |
| Intercept | Fixed overhead time (μs) | Algorithm-dependent |

Per-Algorithm Metrics

| Algorithm | $R^2$ | Std error | p-value | Slope | Intercept |
|---|---|---|---|---|---|
| STRAUSS | 0.983536 | 0.1660 | 1.29e-25 | 6.2956 | -1917.23 |
| PIPPENGER_1 | 0.999995 | 0.0039 | 6.29e-73 | 8.5860 | -133.20 |
| PIPPENGER_2 | 0.999992 | 0.0036 | 9.98e-71 | 6.7066 | -115.04 |
| PIPPENGER_3 | 0.999993 | 0.0026 | 8.15e-71 | 4.9746 | -34.11 |
| PIPPENGER_4 | 0.999996 | 0.0015 | 5.34e-75 | 4.1772 | -13.27 |
| PIPPENGER_5 | 0.999995 | 0.0015 | 1.16e-73 | 3.5473 | 56.56 |
| PIPPENGER_6 | 0.999991 | 0.0019 | 2.30e-69 | 3.1035 | 105.03 |
| PIPPENGER_7 | 0.999994 | 0.0013 | 4.69e-72 | 2.7083 | 219.62 |
| PIPPENGER_8 | 0.999989 | 0.0016 | 2.64e-68 | 2.4879 | 414.58 |
| PIPPENGER_9 | 0.999977 | 0.0021 | 3.76e-64 | 2.2872 | 719.46 |
| PIPPENGER_10 | 0.999953 | 0.0028 | 5.02e-60 | 2.1705 | 1287.92 |
| PIPPENGER_11 | 0.999847 | 0.0049 | 4.73e-53 | 2.0592 | 2321.53 |
| PIPPENGER_12 | 0.999583 | 0.0075 | 3.62e-47 | 1.9331 | 4153.43 |

The linear fit ($R^2$) and standard error for STRAUSS are relatively worse. We can improve this by using smaller batch sizes when calibrating Strauss. I also see negative intercepts (i.e., D values) for PIPPENGER_1..4, which should be positive; this could likewise be addressed by capping these algorithms to medium batch sizes during calibration. Otherwise, the calibrated values are excellent!
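The slope/intercept fit described above boils down to ordinary least squares on (batch size, time) pairs. As a generic sketch (not the PR's calibration code), the closed-form fit looks like this:

```c
#include <stddef.h>

/* Ordinary least-squares fit of y = slope*x + intercept over n samples.
 * The slope corresponds to a per-point C value and the intercept to a
 * fixed D value in the ABCD model. Assumes n >= 2 and non-constant x. */
static void fit_line(const double *x, const double *y, size_t n,
                     double *slope, double *intercept) {
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    size_t i;
    for (i = 0; i < n; i++) {
        sx += x[i];
        sy += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    *slope = ((double)n * sxy - sx * sy) / ((double)n * sxx - sx * sx);
    *intercept = (sy - *slope * sx) / (double)n;
}
```

A negative intercept, as seen for PIPPENGER_1..4 above, is a sign the sampled sizes extend beyond the region where the model is linear, which is why restricting each algorithm to its competitive size range helps.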

@fjahr fjahr force-pushed the 2025-11-mem-multi-var branch 2 times, most recently from 72fd3e8 to c2c9bc2 Compare January 20, 2026 16:27
Contributor Author

fjahr commented Jan 20, 2026

Addressed the comments on the implementation code and rebased, thanks a lot for the review @siv2r ! I need a little more time to think about the calibration again, but will comment on that asap.

@fjahr fjahr force-pushed the 2025-11-mem-multi-var branch from c2c9bc2 to 4aa160c Compare January 25, 2026 16:48
Contributor Author

fjahr commented Jan 25, 2026

@siv2r Thanks again for the review! The hint about using narrower ranges for each of the algorithms in the calibration was awesome; I hadn't thought of it before, and it fixed pretty much all the problems I ran into with the calibration results. I took these suggestions with some smaller additional tweaks, specifically for Strauss, which still had quite a bit of variance. I now see stable calibration results across multiple runs, and the measurements across the different benchmarks are smoother than in any of my previous iterations.

These improvements have given me a lot more confidence in these changes, so I am taking this out of draft status too.

@fjahr fjahr marked this pull request as ready for review January 25, 2026 17:22
@fjahr fjahr force-pushed the 2025-11-mem-multi-var branch from 4aa160c to ef4754b Compare February 3, 2026 10:34
Contributor Author

fjahr commented Feb 3, 2026

While looking at older PRs I randomly found #638, and comparing the overlapping code revealed a small bug in my Pippenger size calculation.

@fjahr fjahr force-pushed the 2025-11-mem-multi-var branch from ef4754b to b2cc6d1 Compare February 3, 2026 10:45